13 research outputs found

    Sparse Multivariate Factor Regression

    We consider the problem of multivariate regression in a setting where the relevant predictors could be shared among different responses. We propose an algorithm which decomposes the coefficient matrix into the product of a long matrix and a wide matrix, with an elastic net penalty on the former and an ℓ1 penalty on the latter. The first matrix linearly transforms the predictors to a set of latent factors, and the second one regresses the responses on these factors. Our algorithm simultaneously performs dimension reduction and coefficient estimation and automatically estimates the number of latent factors from the data. Our formulation results in a non-convex optimization problem, which, despite its flexibility to impose effective low-dimensional structure, is difficult, or even impossible, to solve exactly in a reasonable time. We specify an optimization algorithm based on alternating minimization with three different sets of updates to solve this non-convex problem and provide theoretical results on its convergence and optimality. Finally, we demonstrate the effectiveness of our algorithm via experiments on simulated and real data.
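The decomposition described above can be sketched in a few lines of numpy. This is a minimal illustration of the general idea, not the paper's exact updates: the coefficient matrix B is factored as B = A @ C, an alternating proximal-gradient loop applies an elastic net penalty to the long matrix A and an ℓ1 penalty to the wide matrix C, and all names and default penalty values (`fit_sparse_factor_regression`, `lam_a`, `lam_c`, ...) are illustrative assumptions.

```python
import numpy as np

def soft_threshold(M, t):
    """Elementwise soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)

def fit_sparse_factor_regression(X, Y, k, lam_a=0.1, alpha=0.5, lam_c=0.1,
                                 n_iter=200, seed=0):
    """Alternating proximal-gradient sketch for
    min 0.5*||Y - X A C||_F^2 + elastic_net(A) + lam_c*||C||_1,
    with A: p x k (predictors -> latent factors) and C: k x q (factors -> responses)."""
    p, q = X.shape[1], Y.shape[1]
    rng = np.random.default_rng(seed)
    A = rng.normal(scale=0.1, size=(p, k))
    C = rng.normal(scale=0.1, size=(k, q))
    Lx = np.linalg.norm(X, 2) ** 2            # squared spectral norm of X
    for _ in range(n_iter):
        # Proximal gradient step on A: the smooth part is the loss plus the
        # ridge half of the elastic net; the l1 half is handled by the prox.
        sA = 1.0 / (Lx * max(np.linalg.norm(C, 2) ** 2, 1.0)
                    + lam_a * (1.0 - alpha))
        grad_A = X.T @ (X @ A @ C - Y) @ C.T + lam_a * (1.0 - alpha) * A
        A = soft_threshold(A - sA * grad_A, sA * lam_a * alpha)
        # Proximal gradient step on C with its l1 penalty.
        sC = 1.0 / (Lx * max(np.linalg.norm(A, 2) ** 2, 1.0))
        grad_C = A.T @ X.T @ (X @ A @ C - Y)
        C = soft_threshold(C - sC * grad_C, sC * lam_c)
    return A, C
```

Columns of A that are shrunk entirely to zero effectively remove latent factors, which is one way the number of factors can be read off from the fitted model rather than fixed in advance.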

    Multilevel multivariate predictive systems

    In this thesis we develop various multivariate regression algorithms in a setting where the relevant predictors could be shared among different responses. These algorithms are novel either in the low-dimensional structure they impose, in the objective function they use, or in the phenomenon they model. The contributions of this thesis can be organized into three main categories. First, we study multivariate regression tasks where it is reasonable to believe that the responses are related to factors, each of which is a sparse linear combination of the predictors. We propose an algorithm to perform joint dimension reduction and parameter estimation. This is done by decomposing the coefficient matrix into the product of a long matrix and a wide matrix, with an elastic net penalty on the former and an L1 penalty on the latter. We provide theoretical guarantees for our algorithm and show its superior performance over state-of-the-art algorithms by experiments on simulated and real data. We also present two extensions of our method that provide extra information which can be used to gain more insight into the data. Second, we consider a generalized multivariate regression problem where responses are monotonic functions of linear transformations of predictors. We propose a semi-parametric algorithm based on the ordering of the responses which is invariant to the functional form of the transformation function. We prove strong consistency, provide a convergence rate, and show that our algorithm performs better than linear regression techniques in the presence of non-linearity in data. Finally, we introduce multi-layer, deterministic neural networks implementing probabilistic models of cognition. Our model learns to represent probabilities using realistic inputs in the form of occurrence patterns of events. We pair this model with a neural module applying Bayes' rule to form a comprehensive neural scheme to simulate human Bayesian learning and inference.
Our model also provides novel explanations of base-rate neglect, a notable deviation from Bayes' rule.
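The rank-based idea behind the thesis's second contribution can be illustrated numerically: if each response is a strictly increasing transform of a linear index of the predictors, the ordering of the responses is identical no matter what the transform is, so rank statistics carry the information needed to estimate the index weights without knowing the transformation function. A minimal numpy demonstration of this invariance (illustrative only, not the thesis algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))          # predictors
w = np.array([2.0, -1.0, 0.5])        # index weights
index = X @ w                          # linear index x_i' w

# Two very different strictly increasing transforms of the same index.
y_exp = np.exp(index)
y_cube = index ** 3

# The orderings of the responses coincide exactly, and match the index itself.
assert np.array_equal(np.argsort(y_exp), np.argsort(y_cube))
assert np.array_equal(np.argsort(y_exp), np.argsort(index))
```

Because any rank-based objective sees only these orderings, it produces the same estimate of w under either transform, which is exactly the invariance the abstract describes.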

    Weblog analysis for predicting correlations in stock price evolutions

    In this thesis, we use data extracted from many weblogs to identify the underlying relations of a set of companies in the Standard and Poor (S&P) 500 index. In order to do this, we define a pairwise similarity measure for the companies based on the weblog articles and then apply a graph clustering procedure. We show that it is possible to capture some interesting relations between the companies using this method. As an application of this clustering procedure, and motivated by the fact that many of the factors affecting the stock market can be captured by our clustering, we propose a cluster-based portfolio-selection method which combines information from the weblog data and historical stock prices. Through simulation experiments, we show that our method performs better (in terms of risk measures) than cluster-based portfolio strategies based on the sectors of the companies or the historical stock prices. This suggests that the methodology has the potential to identify groups of companies whose stock prices are more likely to be correlated in the future.

    Weblog Analysis for Predicting Correlations in Stock Price Evolutions

    We use data extracted from many weblogs to identify the underlying relations of a set of companies in the Standard and Poor (S&P) 500 index. We define a pairwise similarity measure for the companies based on the weblog articles and then apply a graph clustering procedure. We show that it is possible to capture some interesting relations between companies using this method. As an application of this clustering procedure we propose a cluster-based portfolio selection method which combines information from the weblog data and historical stock prices. Through simulation experiments, we show that our method performs better (in terms of risk measures) than cluster-based portfolio strategies based on company sectors or historical stock prices. This suggests that the methodology has the potential to identify groups of companies whose stock prices are more likely to be correlated in the future.

    US Presidential Election: What Engaged People on Facebook

    We study Facebook posts published by major news organizations in the 10-month period leading up to the 2016 US presidential election. Our goal is to explore the topics related to the two major-party candidates, Hillary Clinton and Donald Trump, and identify the ones that engaged Facebook users the most. Engagement is measured by the total number of reactions, comments, and shares. Using topic modeling with Latent Dirichlet Allocation (LDA) on the Facebook posts, we identify the top 10 topics related to each candidate and then assess audience engagement for these topics across 10 different news organizations. We use Hierarchical Bayesian Models (HBMs) to analyze the data, which allow us to partially pool information across different sources.

    Invariancy of Sparse Recovery Algorithms


    Tropical forest loss enhanced by large-scale land acquisitions

    Tropical forests are vital for global biodiversity, carbon storage and local livelihoods, yet they are increasingly under threat from human activities. Large-scale land acquisitions have emerged as an important mechanism linking global resource demands to forests in the Global South, but their influence on tropical deforestation remains unclear. Here we perform a multicountry assessment of the links between large-scale land acquisitions and tropical forest loss by combining a new georeferenced database of 82,403 individual land deals (covering 15 countries in Latin America, sub-Saharan Africa and Southeast Asia) with data on annual forest cover and loss between 2000 and 2018. We find that land acquisitions cover between 6% and 59% of study-country land area and between 2% and 79% of their forests. Compared with non-investment areas, large-scale land acquisitions were granted in areas of higher forest cover in 11 countries and showed higher forest loss in 52% of cases. Oil palm, wood fibre and tree plantations were consistently linked with enhanced forest loss, while logging and mining concessions showed a mix of outcomes. Our findings demonstrate that large-scale land acquisitions can lead to elevated deforestation of tropical forests, highlighting the role of local policies in the sustainable management of these ecosystems.